Caching for flash-based databases and flash-based caching for databases
Abstract
Database storage systems today are primarily based on two technologies: HDD (hard disk drive) and DRAM (dynamic random-access memory). It is increasingly difficult for these systems to deliver acceptable performance because of fast-expanding data volumes, growing energy concerns, and cost constraints. The emergence of flash memory has made cost-effective solutions possible. However, conventional storage systems were designed without knowledge of flash memory limitations and flash device characteristics, so they cannot fully exploit the potential of flash memory. This dissertation investigates two major aspects of flash-incorporated database storage systems. The first aspect concerns buffer management in two-tier storage systems where flash devices are used as the primary storage, i.e., caching for flash-based databases. The second aspect concerns mid-tier cache management in three-tier storage systems where flash memory is used as a page cache to speed up accesses to the slower primary storage, i.e., flash-based caching for databases. The major contributions can be summarized as follows:
• It identifies the weaknesses of previously proposed buffer algorithms for flash-based storage systems and, as an improvement, proposes the CFDC (clean-first dirty-clustered) algorithm, which is one of the earliest proposals addressing the flash random write problem.
• It examines the parameter tuning problem, which discourages the practical use of various previously proposed buffer algorithms, and proposes the CASA (cost-aware self-adaptive) algorithm, which automatically adapts itself to the extent of device R/W asymmetry and to changing workloads at runtime.
• From an architectural perspective, it empirically compares conventional storage systems with three-tier storage systems that use flash as the mid-tier cache, and derives implications for system designers.
• It identifies the cold-page migration problem in flash-based mid-tier caching and proposes two effective solutions. The results point to an important architectural consideration that has been ignored so far: native management of flash memory by the mid-tier cache manager.
Similar resources
Reliable Writeback for Client-side Flash Caches
Modern data centers are increasingly using shared storage solutions for ease of management. Data is cached on the client side on inexpensive and high-capacity flash devices, helping improve performance and reduce contention on the storage side. Currently, write-through caching is used because it ensures consistency and durability under client failures, but it offers poor performance for write-h...
Flash-Conscious Cache Population for Enterprise Database Workloads
Host-side flash caching has lately emerged as a suitable and effective means of accelerating enterprise workloads. However, cache management for flash-based caching is different from traditional DRAM-based caching. A flash cache sits underneath the DRAM cache. Its position in the hierarchy, combined with the unique characteristics of flash, calls for a different cache management solution. Specifica...
Dynamic and Transparent Data Tiering for In-Memory Databases in Mixed Workload Environments
Current in-memory databases clearly outperform their disk-based counterparts. In parallel, recent PCIe-connected NAND flash devices provide significantly lower access latencies than traditional disks, making it possible to reintroduce classical memory paging as a cost-efficient alternative to storing all data in main memory. This is further eased by new, dedicated APIs which bypass the operating system, o...
DIDACache: A Deep Integration of Device and Application for Flash Based Key-Value Caching
In recent years, flash-based key-value cache systems, such as Facebook’s McDipper and Twitter’s Fatcache, have attracted considerable interest in industry. These cache systems typically use commercial SSDs to store and manage key-value cache data in flash. Such a practice, though simple, is inefficient due to the huge semantic gap between the key-value cache manager and the underlying flash devices. In thi...
Host Side Caching: Solutions and Opportunities
Host side caches use a form of storage faster than disk and less expensive than DRAM to deliver the speed demanded by data intensive applications. Today, this form of storage is NAND Flash, complementing a disk-based solution. A host side cache may integrate into an existing application seamlessly. This may be realized by using an infrastructure component (such as a storage stack middleware or ...